ABSTRACT

This paper investigates the effects of school on learning as exhibited
in the IEA research. In all three studies, there were regression
analyses carried out in which the total variation to be accounted for
was that between school averages, and regression analyses in which the
total variation to be accounted for was the full variation between
individual students. In examining overall effects of school variables,
the between-student analysis should be used even though it includes a
lot of variance due to individual differences within schools. It
appears better to recognize that school variables can account only for
that fraction of the total variance that lies between schools, and to
use the between-student analysis for the study of effects of school
variables. In the use of the between-student analysis, six countries
are focused on which engaged in testing in literature, reading, and
science: Chile, England, Finland, Italy, Sweden, and the United States.
In addition, the analysis is confined to the 10- and 14-year-olds.
Several general results are useful to state. For all three subjects,
the total effect of home background is considerably greater than the
total direct effect of school variables. Comparisons between ages 10
and 14 show a slightly higher total direct effect of schools at age 14
than at age 10 but a smaller proportion of it independent of home
background. The data show fairly conclusively that reading achievement
is more fully an outgrowth of home influences than are either of the
other two subjects, and less a function of what takes place at school.
(Author/JM)
Effects of School on Learning: The IEA Findings
James Coleman
University of Chicago
One's first response to the IEA publications on science education,
reading comprehension, and literature must be one of amazement and respect:
amazement that such a massive set of studies of cross-national achievement
of children could be successfully carried out, and respect for those who
have done so. It is important to record this first impression, because the
studies constitute the best models in existence for cross-national research
on social institutions and social behavior. This fact should not be lost
in the detailed comparisons, secondary analyses, and critiques of the three
studies. With their publication, the comparative study of the functioning
of different societies has made an important advance.

This paper, however, is not directed at the most salient results of
the research, the differences between countries and the differences between
subjects, but at a different question: the effects of school on learning as
exhibited in this research. This question could - and has - been asked
through research within a single country, on a single subject-matter, and
at a single age. But the special virtue of this research is that it covers
a number of countries, three age levels, and three subject-matters. For
each of the subject-matters separately, the IEA authors have themselves
examined this question. Indeed, in all three studies, this question occupies
significant portions of the book. Beyond this, however, there are some
things to be learned from comparing the separate studies, and I will attempt
some such comparisons.

I will divide my comments into two parts, the first methodological and
the second substantive. It is unfortunate that the first is much the larger,
for this is an indication that we are still in the early stages of such
work, where methodological issues, rather than substantive results, constitute
the largest portion of our discourse. Those whose interest is wholly in the
substantive results can turn directly to Part II on page 35.

I. Methodological Issues 
In all three of these studies, a particular strategy was employed in
evaluating the relative importance of different classes of variables. In
both the between-school analyses and overall analyses, variables were
divided into four "blocks," labelled Blocks 1, 2, 3, and 4. The blocks are
roughly defined as follows (the operational definitions differ somewhat
from study to study, from between-school to overall analysis, and from
Population I to II to IV, but the intent is the same in all cases):

Block 1 Home background, including age and sex of child.

Block 2 Type of school and type of program, for all countries and
grade levels in which there was differentiation of program or of school or
both.

Block 3 School and Instructional Variables 

Block 4 "Kindred" variables, that were not seen as either necessarily
prior to achievement nor necessarily consequent upon it, but as possibly
either. These were variables such as interest in the subject and motivation.

Interest was centered primarily on variables in Block 3, because they
are the variables which can be affected by educational policy and practice.
However, interest was also great in Block 2 variables, for somewhat different
reasons. Since the type of school a student is in and the type of
program he is in (when there are differing school types and differing
programs, as there were in most of the countries) are ordinarily determined
on the basis of his performance up to that point, the measure for type of
school and type of program shows primarily the degree of differentiation
between high- and low-performing students in the system. For example, in
the still highly selective secondary system of England, the type of program
and type of school accounted for 17 per cent of the total variance among
the 14 year olds in science, 13.7 per cent in reading, and 11.9 in literature,
the highest among countries in science, second highest in literature,
and fourth highest in reading. Thus the interest in Block 2 variables is
also related to educational policy, but more nearly to policies of differentiation
or selection than to policies designed to directly increase learning.
But more of that later.

The interest in Block 1 variables is not quite so direct, except in
Population IV, the last year of secondary school. In the lower grades where
all students (in the developed countries) are still in school, the variation
in family backgrounds reflects the variation in family resources throughout
the country, weighted by families' fertility. The size of the effect of
this set of variables is thus a product of the variation in family resources
(that is, the degree of inequality) in the country, and the transformation
of those family resources into (or the effect of that inequality upon) the
child's cognitive achievement. Confounded with these two variables is the
differential success in measuring the family resources or home backgrounds
in different countries. Again, I shall return to this later.

Interest in Block 4, the "kindred" variables, is somewhat less than in
the other three blocks for the authors, and will also be of less interest
to us here, simply because they are not regarded as wholly causes of cognitive
achievement, nor consequences. In further investigations, they may
very well be of interest as dependent variables themselves, because as the
literature study shows (p. 284), interest varies widely among countries,
and also may well be a function of certain school variables. One such investigation
is reported in the literature study (see pp. 410-417). But here,
as for the authors themselves, the principal focus will be on cognitive
achievement, and while the kindred variables may themselves be important to
that achievement, there is nothing in these analyses that would allow such
an inference, and they will be, for me, nuisance independent variables, to
be disposed of without fanfare.

Now the reason for separating these variables into blocks of this sort
was to bring some kind of order into the regression analyses. The problems
confronting the analysts were very great, and the number and variety of
variables potentially affecting achievement were enormous. Complex analyses
prior to the regression analyses themselves were necessary merely to create
a reasonably small set of variables without throwing out variables that
were important in their effects on achievement. I will not dwell on that
elaborate process of data reduction except to commend the analysts. I
thought that in Equality of Educational Opportunity our task of analyzing
different racial groups and different regions of the country was a massive
one; but it is dwarfed by this.

I will go on, however, to be ungracious by commenting on a few difficulties
and disorders that still remain. These difficulties can all be summarized
by saying that the process was not quite finished. The three studies
used, in the end, procedures that were sufficiently different that exact
comparisons cannot be made across studies, nor even across the between-school
and between-student analyses in the same study. For example, in Literature,
different variables, from a limited subset, were allowed to enter the analysis
for different countries, based on F-ratio, while in Science, the same
variables were used for all countries. Again, in Science the between-school
analyses used different weights for different countries in creating composite
variables for the between-school analysis, but the same weights over
all countries in the between-student analysis. Probably most exasperating
to the reader is the fact that the reporting differed quite sharply for
the different studies. Some of this merely created inconvenience, as the
difference between Reading's use of multiple correlations and Science
and Literature's use of squares of multiple correlations, as explained variance.
More serious is the fact that non-comparable data are reported,
which cannot be recovered: Literature reported (in an appendix) five measures
for each individual variable within each block, for the between-student
analysis, Reading reported two measures for each individual variable in the
between-school analysis, and Science reported no measures for individual
variables in either analysis, except for zero-order correlations. And in
the between-school analysis, Reading did not separate out, in its reporting,
Block 1 variables from Block 2, so that it is not possible to examine the
variation in performance between children in different types of programs
or schools, for comparison with the other two studies (see Chapter 7).

But all these are minor quibbles, some caused by nothing more than my
difficulties in finding where comparable data were treated in the three
studies. I want to discuss serious questions of the methodology used in
the regression analyses. For unless it is fully clear what is done through
use of this methodology, it will not be possible to draw substantive conclusions
about the effect of schools on learning. It will be my contention
that the authors were not fully clear about what they were doing, and that
this lack of clarity has led them to carry out analyses that prevent one
from answering certain important questions about the effect of schools on
learning.

The problem that I want to examine goes to the heart of the procedures
used in the regression analyses in all three studies, in both the between-school
and between-student analyses. This is the way in which blocks of
variables were entered into the analysis. The blocks were used not merely
to group variables into sets that were similar in type and interpretation,
but also to allow a sequential order in the regression analysis. Block 1
variables, home background and age and sex of child, were entered first,
and measures were reported for a regression equation including only them.
The measure was explained variance or its square root, the multiple correlation
coefficient, and in some analyses, more detailed measures of variables
within this block. Block 2 variables, type of program and type of school,
were entered second, and measures were reported for them using equations
containing Block 1 and Block 2 variables. The measure was the increment to
explained variance, a measure obtained by subtracting the explained variance
with Block 1 variables alone ($R_1^2$) from the explained variance with Block 1
and Block 2 variables ($R_{12}^2$), that is, $R_{12}^2 - R_1^2$. Next were entered Block 3
variables, school variables of numerous sorts, in an equation including Block
1 and Block 2 and Block 3 variables. The measure reported was analogous to
that of type of school and program, that is, the increment in variance
explained, $R_{123}^2 - R_{12}^2$.

After this, different studies did different things, all entering the
kindred variables (Block 4), and Literature going on with Blocks 5 and 6.
But these need not concern us.
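
The sequential procedure just described can be sketched numerically. The following is only a minimal illustration: the correlation matrix is invented for the example (it is not IEA data), all variables are assumed standardized, there is a single variable per block, and the helper names `solve` and `r_squared` are hypothetical. It computes $R_1^2$, $R_{12}^2 - R_1^2$, and $R_{123}^2 - R_{12}^2$ in the fixed order of entry.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def r_squared(corr, predictors, target):
    """Explained variance of `target` on `predictors` (standardized variables)."""
    rxx = [[corr[i][j] for j in predictors] for i in predictors]
    rxy = [corr[i][target] for i in predictors]
    beta = solve(rxx, rxy)
    return sum(b * r for b, r in zip(beta, rxy))


# Hypothetical correlations, invented for illustration (NOT IEA results).
# Order: home background, school type, school variable, achievement.
CORR = [
    [1.000, 0.500, 0.400, 0.620],
    [0.500, 1.000, 0.350, 0.505],
    [0.400, 0.350, 1.000, 0.530],
    [0.620, 0.505, 0.530, 1.000],
]
HOME, TYPE, SCHOOL, ACH = 0, 1, 2, 3

r2_1 = r_squared(CORR, [HOME], ACH)
r2_12 = r_squared(CORR, [HOME, TYPE], ACH)
r2_123 = r_squared(CORR, [HOME, TYPE, SCHOOL], ACH)

block1 = r2_1            # reported for Block 1
block2 = r2_12 - r2_1    # increment reported for Block 2
block3 = r2_123 - r2_12  # increment reported for Block 3

# For contrast: what the school variable explains when entered alone.
r2_3_alone = r_squared(CORR, [SCHOOL], ACH)

print(f"Block 1: {block1:.4f}  Block 2: {block2:.4f}  Block 3: {block3:.4f}")
print(f"School variable entered alone: {r2_3_alone:.4f}")
```

With these invented correlations the school variable is credited with about .07 when entered last, although it would account for about .28 if entered alone, so the figure reported for a block depends heavily on its position in the order of entry.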

The major question is an obvious one: what kind of inferences were
drawn from the measures reported ($R_1^2$, $R_{12}^2 - R_1^2$, $R_{123}^2 - R_{12}^2$), and
what kind can be properly drawn. First, note that the measures are asymmetric
for the three blocks. Despite this, inferences were made in each of the
studies about the relative effects of home background and school variables,
i.e., Blocks 1 and 3 (pages 151-2, 184 in Literature, p. 235, 238, 298-9 in
Science, p. 98-100, 123-2 in Reading). In all three studies, the inferences
were that home background (Block 1) is more important than school variables
(Block 3) and that school variables showed little effect. The terms used
to describe school effects varied from study to study, but the characterization
of school effects on the basis of these analyses had a certain
consistency: "disappointing," (p. 298, Science), "largely negative results"
(p. 100, Reading), "there is little the school seems to be able to do to
enhance or inhibit" (p. 184, Literature).

To be sure, the quantitative results appear to bear these statements
out, and as in earlier studies, the home background variables appear much
stronger than the school variables. Furthermore, I am sure that, just as
other studies have shown, the home background variables would still have
shown great power and the school variables still been "disappointing" if
the analysis had been more symmetric. All regression analyses I have seen
on these questions, analyzed in whatever way imaginable, would have shown
this. But the fact remains that these comparisons are made on the basis of
an analysis that is very asymmetric. The measure for the Block 1 variables
is $R_1^2$, while the measure for the Block 3 variables is $R_{123}^2 - R_{12}^2$ (which
I shall call in pages following, $a_3$). It is not the case that $R_1^2$ is compared
with $R_3^2$, or $a_1$ ($= R_{123}^2 - R_{23}^2$) with $a_3$. A fixed order is maintained
throughout.

The rationale for this fixed order is best explained in the between-school
analysis, and in all three studies a similar rationale was given. In
Science and Literature, a yacht race analogy was used to justify the fixed
order, and I will reproduce this justification. Gilbert Peaker is responsible
for the yacht race analogy, and one can detect his fine hand in shaping
the analyses generally. In my estimation, it is well that this was the
case, for there are few if any working analysts of school data for whom I
have more respect and admiration than Gilbert Peaker, and I can easily
imagine the morass that these analyses might have fallen into if they had
not felt his guidance.

Nevertheless, I want to take issue with the use of this asymmetric
approach for comparing the relative effects of home background and school
variables, and to suggest just what kind of inferences can properly be
drawn from the analysis as carried out. The simplest way to do so is by
use of the yacht race analogy.

In the attempt to discover effects of school factors on achievement,
perhaps the principal villain is the fact that student populations in
different schools differ at the outset, and because of this difference,
it is not possible merely to judge the quality of a school by the achievements
of the students leaving it. It is necessary to control in some way
for the variations in student input with which the teachers and staff of
the school are confronted. In some way, it is the increment in achievement
that the school provides, which should be the measure of the school's
quality. If we had good measures of that increment, as well as good
measures of the level of various school resources in the same school, it
would be possible to establish a relation between the size of the increment
and the level of certain resources, and thus to determine which school
resources were most important to learning.

The problem lies in establishing the appropriate baseline so that some
estimate of the achievement increment can be made, and most cross-sectional
studies have, like this one, attempted to use factors in the student's own
background or possibly in the community which can provide an estimate of
the student input to the school and thus allow an estimate of the increment
of achievement.

The yacht race analogy is this: in a yacht race, in order to give all
boat crews a chance that is independent of the size of the boat, size of
sails, and other dimensions of the boat, a formula is used that gives each
yacht a handicap, based on the expected or average performance of yachts
with those dimensions. This places all boats in an equal starting position,
so to speak, and the crews have equal chances to win, even if their boats'
sails are small.

Similarly, thinking of the children's performance in analogy to the
yacht's performance, we must recognize that because of the effect of background
and other factors, different children have different expected performance
apart from their school. Then if we are to obtain a measure of what
the school does (in analogy to the crew) to increase this performance, we
must give each school a handicap score which is based on the expected
performance of students like those in this school.

On these grounds, a school handicap variable was constructed, based
primarily on the family background characteristics of students in the school.
In addition to the school handicap score, other variables which differed
from school to school, such as sex of students and age, were included in
Block 1 because of the different expected or average performance of students
of different sex and different age.*

* Age in years was held constant in the analysis; but age in months within
that year showed some positive relation to performance in most of the analyses.
Boys performed much better in science and girls much better in literature.
In these two subjects, sex of the student was one of the most important var-
The analogy appears a good one. But there is one problem, if we are
to start the schools out on a completely equal footing for the analysis of
Block 2 and Block 3 variables. This is, how do we determine the size of
the handicap? In handicapping yachts, the size of the handicap is determined
by racing times of yachts of given dimensions with a variety of different
crews, to determine the average or expected performance for a yacht of those
dimensions. Suppose, however, yachts with larger sails tended to be manned
by better crews, so that the performance of those yachts, averaged over many
crews, included an increment of performance due to the better crews, rather
than only an increment due to the larger sails. Then the handicap score
will be wrong, because of the correlation of good crews with larger-sailed
boats. The average performance of larger-sailed boats will be overestimated,
and the subsequent performance of crews sailing those boats will be underestimated.
It is only because a handicap score for boats with given dimensions
can be made independent of the quality of the crews that the handicapping
works correctly.

But the hypothetical possibility I have described for the yacht handicapping
is just what exists in the schools. There is a correlation between
student input, as approximated by home background variables, and school
resources, such as teacher quality and the like. Consequently, to develop
a handicap for the schools by first accounting for all the variance possible
through home background variables (entering Block 1 first) extracts out not
merely the variance due to student input, but also that due to the school
variables that are correlated with student input. Then to compare the effect
of school variables with that of home background variables is not appropriate.
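
The handicapping difficulty can be put in numbers. The following is a minimal sketch under stated assumptions, not a description of how the studies computed their handicap: sail size and crew quality are standardized with correlation `sail_crew_corr`, performance is assumed to be a weighted sum of the two, and the function names and values are hypothetical. Under those assumptions the slope of performance on sail size alone is the sail effect plus the crew effect times the correlation, and the crew-related variance left after the handicap is removed shrinks by the factor one minus the squared correlation.

```python
def handicap_slope(sail_effect, crew_effect, sail_crew_corr):
    """Slope of performance on sail size when crew quality is left out.

    Assumes standardized variables and
    performance = sail_effect * sail + crew_effect * crew.
    """
    return sail_effect + crew_effect * sail_crew_corr


def crew_variance_after_handicap(crew_effect, sail_crew_corr):
    """Variance still attributable to crews once the sail handicap is removed."""
    return crew_effect ** 2 * (1 - sail_crew_corr ** 2)


# Hypothetical values, chosen only for illustration.
SAIL, CREW = 0.5, 0.4

for rho in (0.0, 0.5):
    slope = handicap_slope(SAIL, CREW, rho)
    left = crew_variance_after_handicap(CREW, rho)
    print(f"corr={rho:.1f}: handicap slope={slope:.2f}, "
          f"crew variance left={left:.3f} (true {CREW ** 2:.3f})")
```

When crews and sails are uncorrelated the handicap recovers the sail effect exactly; when they are correlated the handicap absorbs part of what the crews contribute, which is the situation of Block 1 entered ahead of Block 3.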

iables in the analysis. (In Reading, no separate report was made for sex
except in the zero-order correlations, which showed small correlations, favoring
girls in 33 countries and boys in 14.)
I am not an advocate of path analysis, but I can use a path diagram similar
in principle to that presented for Japan (p. 280-284, Science) to illustrate
the point. Assume a home background variable (Block 1), a school type variable
(Block 2), and two school variables (Block 3).
[Path diagram: Home Background (Block 1), School Type (Block 2), Teachers'
Education and Hours Homework Assigned (Block 3), and School Achievement,
connected by causal paths labelled $a_{ij}$ and $b_{ij}$.]

The diagram indicates the causal reasoning behind the sequence of blocks,
and I have no quarrel whatsoever with this set of a priori causal assumptions.
All analyses of school effects that I know suggest this kind of
scheme, in which home background precedes, in a causal sequence, both the
type of school and particular variables of the school, and in which the
type of school (in a differentiated system) precedes the particular school
variables.

An analysis to identify the relative sizes of the causal flows labelled
by $a_{ij}$ and $b_{ij}$ in the diagram would require four regression equations:
a) school type as dependent variable and home background as independent;
b) teachers' education as dependent, school type and home background as
independent; c) hours homework as dependent, school type and home background
as independent; and finally, d) school achievement as dependent, and all
four others as independent. These do not, with the exception of (d),
correspond to the equations used in the analyses under discussion. But no
matter. What we want to see is just what the analyses under discussion
did correspond to.
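
The four equations (a) through (d) can be sketched as follows. This is a minimal illustration assuming standardized variables and a hypothetical correlation matrix invented for the purpose (not IEA data); the helper names `solve` and `regress` are likewise only illustrative. Each regression returns the standardized weights that serve as the path coefficients for one dependent variable.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def regress(corr, predictors, target):
    """Standardized regression weights of `target` on `predictors`."""
    rxx = [[corr[i][j] for j in predictors] for i in predictors]
    rxy = [corr[i][target] for i in predictors]
    return solve(rxx, rxy)


# Hypothetical correlations, invented for illustration (NOT IEA results).
# Order: home, school type, teachers' education, homework, achievement.
CORR = [
    [1.000, 0.500, 0.400, 0.250, 0.6450],
    [0.500, 1.000, 0.350, 0.200, 0.5250],
    [0.400, 0.350, 1.000, 0.115, 0.5415],
    [0.250, 0.200, 0.115, 1.000, 0.2745],
    [0.645, 0.525, 0.5415, 0.2745, 1.000],
]
HOME, TYPE, TEACH, HOMEWORK, ACH = range(5)

# (a) school type on home background
(a12,) = regress(CORR, [HOME], TYPE)
# (b) teachers' education on home background and school type
a13, a23 = regress(CORR, [HOME, TYPE], TEACH)
# (c) hours of homework on home background and school type
b13, b23 = regress(CORR, [HOME, TYPE], HOMEWORK)
# (d) achievement on all four others
a14, a24, a34, b34 = regress(CORR, [HOME, TYPE, TEACH, HOMEWORK], ACH)

print(f"a12={a12:.2f} a13={a13:.2f} a23={a23:.2f} b13={b13:.2f} b23={b23:.2f}")
print(f"a14={a14:.2f} a24={a24:.2f} a34={a34:.2f} b34={b34:.2f}")
```

Only equation (d) has a counterpart among the equations actually used in the studies, which is why the reported increments cannot be read directly as path coefficients.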

1) First, an equation with Block 1 as independent and school achievement as
dependent shows the total effect of the Block 1 variable, through these
paths: direct: $a_{14}$; 2-step: $a_{12}a_{24}$, $a_{13}a_{34}$, $b_{13}b_{34}$; 3-step: $a_{12}a_{23}a_{34}$
and $a_{12}b_{23}b_{34}$. The variance explained by Block 1 is explained through
these six paths.

2) Second, the equation with Block 1 and Block 2 shows the effect of three
paths from equation 1 ($a_{14}$, $a_{13}a_{34}$, $b_{13}b_{34}$) plus those from school type,
both direct and 2-step: direct: $a_{24}$; 2-step: $a_{23}a_{34}$ and $b_{23}b_{34}$. When the
variance accounted for in equation 1 is subtracted out, what is left is all
due to school type, that is, $a_{24}$, $a_{23}a_{34}$, and $b_{23}b_{34}$: that due to $a_{14}$, to
$a_{13}a_{34}$, and to $b_{13}b_{34}$ is subtracted out. But also subtracted out is a
portion of the variance that operated through school type: $a_{12}a_{24}$, $a_{12}a_{23}a_{34}$,
and $a_{12}b_{23}b_{34}$. These paths are much less strong than $a_{24}$, $a_{23}a_{34}$, and $b_{23}b_{34}$
whenever $a_{12}$ is itself small; and it usually is not very large. Nevertheless,
some portion of the effect of $a_{24}$, $a_{23}a_{34}$, and $b_{23}b_{34}$ is subtracted
out, the portion depending on the size of $a_{12}$.

3) The third equation includes Block 1, 2, and 3 variables, and includes
variance from two paths in equation 2 ($a_{14}$ and $a_{24}$) plus that from Block 3
variables: $a_{34}$ and $b_{34}$. When the variance accounted for in equation 2 is
subtracted out, what is left is all due to Block 3, that is $a_{34}$ and $b_{34}$:
that due to $a_{14}$ and $a_{24}$ is subtracted out. But also subtracted out is a
portion of the variance due to $a_{34}$ and $b_{34}$. Variance due to all these
paths is subtracted out: $a_{13}a_{34}$, $a_{23}a_{34}$, $b_{13}b_{34}$, $b_{23}b_{34}$. Thus what is
left after equation 2 variance is subtracted out is var($a_{34}$) - var($a_{13}a_{34}$)
- var($a_{23}a_{34}$) + var($b_{34}$) - var($b_{13}b_{34}$) - var($b_{23}b_{34}$). (I am using
var( ) to mean the variance due to a given causal relation.)
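
The six paths through which Block 1 operates, listed under equation 1, can be traced in a small sketch. The path coefficients below are hypothetical values chosen for illustration, not estimates from the studies, and the function name is likewise an assumption. In a standardized recursive model of this kind with home background as the only exogenous variable, the sum of the six paths equals the zero-order correlation of home background with achievement, so its square is the $R_1^2$ reported for Block 1.

```python
# Hypothetical standardized path coefficients (illustrative only).
PATHS = {
    "a12": 0.5,               # home -> school type
    "a13": 0.3, "b13": 0.2,   # home -> teachers' education, homework
    "a23": 0.2, "b23": 0.1,   # school type -> teachers' education, homework
    "a14": 0.4,               # home -> achievement (direct)
    "a24": 0.2,               # school type -> achievement
    "a34": 0.3, "b34": 0.1,   # teachers' education, homework -> achievement
}


def home_effect_by_path_length(p):
    """Return the direct, 2-step, and 3-step parts of the home effect."""
    direct = p["a14"]
    two_step = (p["a12"] * p["a24"]
                + p["a13"] * p["a34"]
                + p["b13"] * p["b34"])
    three_step = (p["a12"] * p["a23"] * p["a34"]
                  + p["a12"] * p["b23"] * p["b34"])
    return direct, two_step, three_step


direct, two_step, three_step = home_effect_by_path_length(PATHS)
total = direct + two_step + three_step

print(f"direct={direct:.3f} two-step={two_step:.3f} three-step={three_step:.3f}")
print(f"total effect={total:.3f}, so R_1^2={total ** 2:.4f}")
```

In this invented case more than a third of the total home effect runs through school type and the school variables, all of which is credited to Block 1 when it is entered first.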

Now, let us summarize what variance is contained in the numbers reported
for Blocks 1, 2, and 3 in these studies:

Block 1 ($R_1^2$):
    var($a_{14}$)
    var($a_{12}a_{24}$)
    var($a_{13}a_{34}$)
    var($b_{13}b_{34}$)
    var($a_{12}a_{23}a_{34}$)
    var($a_{12}b_{23}b_{34}$)

Block 2 ($R_{12}^2 - R_1^2$):
    var($a_{24}$) - var($a_{12}a_{24}$)
    var($a_{23}a_{34}$) - var($a_{12}a_{23}a_{34}$)
    var($b_{23}b_{34}$) - var($a_{12}b_{23}b_{34}$)

Block 3 ($R_{123}^2 - R_{12}^2$):
    var($a_{34}$) - var($a_{13}a_{34}$) - var($a_{23}a_{34}$)
    var($b_{34}$) - var($b_{13}b_{34}$) - var($b_{23}b_{34}$)

This chart shows the asymmetry introduced by the procedure used in these
studies. Note that asymmetry exists in two ways: the variance for Block 1
includes all the variance due both to the direct path and all indirect
paths; the variance for Block 3 not only is limited to that from direct
paths, but excludes that due to indirect paths from earlier steps.

Now there is nothing wrong with such asymmetry, but it is important to
be aware of its implications. Two of its important implications are these:
1. It is not possible to make a comparison of the amount of variance
accounted for in different blocks, and say, for example, that Block 1, home
background, accounts for much more variance than Block 3. Thus the statements
in these studies about the small effects of school variables compared
to the effects of home background, which "account for much of the variance,"
(p. 152, Literature), are of "decided importance," (p. 100, Reading), or are
"quite considerable," (p. 235, Science), are not appropriate statements on
the basis of the analyses reported. The statements are very likely true,
as inspection of the zero-order correlations, the path analysis for Science
in Japan, or any of numerous other indicators suggest, but in order to make
an explicit comparison, a symmetric analysis would be necessary, for
example, a commonality analysis of the variance $R_{123}^2$, showing both the unique
variance and total variance accounted for by each block in that equation.
As it stands, the variance estimates at Block 1 are of total variance due
to Block 1 variables, while those at Block 3 are of the unique variance
due to Block 3 variables, with those at Block 2 being somewhere in between.
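
A symmetric treatment of the kind mentioned here can be sketched as follows. This is a minimal illustration, assuming one standardized variable per block and a hypothetical correlation matrix invented for the example (not IEA data); it is a simplified stand-in for a full commonality analysis, reporting for each block only its total variance (the block entered alone) and its unique variance (the block entered last). The helper names are illustrative.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def r_squared(corr, predictors, target):
    """Explained variance of `target` on `predictors` (standardized variables)."""
    rxx = [[corr[i][j] for j in predictors] for i in predictors]
    rxy = [corr[i][target] for i in predictors]
    beta = solve(rxx, rxy)
    return sum(b * r for b, r in zip(beta, rxy))


# Hypothetical correlations, invented for illustration (NOT IEA results).
# Order: home background, school type, school variable, achievement.
CORR = [
    [1.000, 0.500, 0.400, 0.620],
    [0.500, 1.000, 0.350, 0.505],
    [0.400, 0.350, 1.000, 0.530],
    [0.620, 0.505, 0.530, 1.000],
]
BLOCKS = {"home": 0, "type": 1, "school": 2}
ACH = 3

r2_full = r_squared(CORR, list(BLOCKS.values()), ACH)

summary = {}
for name, idx in BLOCKS.items():
    others = [v for v in BLOCKS.values() if v != idx]
    total = r_squared(CORR, [idx], ACH)               # block entered alone
    unique = r2_full - r_squared(CORR, others, ACH)   # block entered last
    summary[name] = (total, unique)
    print(f"{name:>6}: total={total:.4f}  unique={unique:.4f}")
```

In the fixed-order procedure the figure reported for Block 1 corresponds to the `total` column and the figure for Block 3 to the `unique` column; placing both measures side by side for every block is what makes the comparison symmetric.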

But it is perhaps not even appropriate to want to make such symmetric
comparisons between "effect of home" and "effect of school" if one has in
mind the kind of causal diagram shown above, because the home and school
occupy quite different places in it: home variables partly determine school
variables, but not the other way around. What one might want to specify
for Blocks 1 and 3 are these effects:

1) The total effect of variations in Block 1 (home background) variables,
both through its impact on school variables and independent of those
variables.

2) The total direct effect of variations in Block 3 (school) variables,
whether this effect is merely implementing the force of home background
(paths $a_{12}a_{23}a_{34}$, $a_{13}a_{34}$, $a_{12}b_{23}b_{34}$, $b_{13}b_{34}$) or independent
of it. This may be thought of as the potential direct effect of
school variables if they were distributed independently of home background.

3) The direct effect of variations in Block 3 (school) variables as
distributed independent of or over and above the force of home background.

4) The direct effect of variations in Block 1 (home background) variables
apart from the force of home background in shaping or selecting
schools.

Perhaps other effects are desired as well, but these appear to me the most
important. One might then want a) to compare effects 2 and 4, setting the
direct effect of school against the direct effect of home independent of school.
But if one did so he should realize that this does not express the total
effect of home, because the home acts to determine the school variables themselves.
Consequently, one might want b) to compare 1 and 2, the total effect
of variations in home background with the total direct effect of school.
One might also want c) to compare effects 1 and 4, to determine the proportion
of the home's effect that takes place through shaping or selecting the
school the child goes to, and the proportion that takes place directly
through its impact on the child himself.* And one might want d) to compare
effects 2 and 3, to examine what proportion of variations in the school's
impact is distributed independently of differences in family background,
compared to the proportion that merely reinforces those differences. In
addition, one might want e) to compare effects 1 and 3, recognizing that
these are total effects of home background variations, but only partial effects
of school variations. The idea here might be to compare the total effects of
family variations on achievement with the effect on the child's achievement
of other variations in society that act through the schools but independent of
the home.** Finally, one of the most important comparisons one might want
to make lies within a block: for example, one might want f) to compare the
relative sizes of the total direct effects of different Block 3 variables
* When Block 2 is included in the consideration, for differentiated school
systems further possibilities exist, some of which may be quite important:
for example, the effect of home (through whatever means, including direct
effects on the child's performance) in determining the type of school attended
vs. the home's effect apart from this; the effect of the type of school in
itself on achievement, apart from the specific resources that exist in that
type of school (for example, through the selected student body) vs. the effect
of type of school through the school resources it provides.

** This is not the same as comparing the total effect of the family on the
child's achievement with the total effect of the other aspects of society
through school on his achievement. Since we are not testing the absence vs.
(comparisons within (2) above), or comparison of the relative sizes of the
effects of different Block 3 variables that, as distributed, are independent
of the force of home background (comparisons within (3) above).

That is, given a set of a priori assumptions about causal flows, the
kind of comparisons one wants to make may very well not be symmetric, and
it is not reasonable only to think of comparing "home background effects"
and "school effects" without further specifications.

Now given the kind of analyses that have in fact been carried out in
these studies, which of these comparisons is possible? What has been measured
through the sequential introduction of blocks, and the reporting of $R_1^2$,
$R_{12}^2 - R_1^2$, and $R_{123}^2 - R_{12}^2$, is effects (1) and (3) listed above. Consequently,
it is possible, among countries, to show two things, if we first consider
blocks as wholes and consider only Blocks 1 and 3, neglecting Block 2 for
the present, or combining it with 1: the total effect of home background,
independently of whether it operates through shaping the school or shaping
the child, and the effect of school variables independent of the force of home
background. Of the various comparisons, (a) through (f) above, which of
them are possible with these data? Only two, (e) and (f).

Now this is not disastrous in any way. If the analyses had been carried
out so that all comparisons (a) through (f) could be made, then there would
have been far more to do than possible within the confines of these volumes.
But it is important to recognize what kind of comparisons the data do allow,
so that the inferences drawn are not incorrect.

the presence of the family or the absence vs. the presence of other societal
forces, all that is possible is testing the total effect of variation within
families vs. the effect of variation within schools induced by other
societal forces. An example of a case in which the other societal forces
were almost wholly absent was in Prince Edward County, Virginia, several
years ago, when the public schools were closed for several years. What
happened then was an intensification of the effect of home background.
Families with any financial resources to spare, cognitive skills of their
own to transmit to their children, or interest in their children's education
used their money to create private schools for their children or to

Everything I have said so far has been a necessary methodological
preface to any discussion of substantive results about effects of schools
on learning as shown by these studies. For until it is clear what inferences
are possible with these data and what are not, it is not reasonable to begin
to examine the substantive results. But before beginning this examination,
it is necessary to carry further the methodological questions, asking about
the use of "variance added" as a measure of the effect of a class of
variables.

The Use of "Variance Added"

In all these studies, "explained variance added" was used as the principal
measure of the effect of a class of variables, although in two of the
studies an additional measure was used: standardized regression weights in
the reading study, and unique contribution to explained variance in the
literature study. The use of this measure is dictated in part by the fact
that it is one of the few easily obtainable measures of the effect of a
set of variables, as distinct from the effect of a single variable, for
which regression coefficients and standardized regression coefficients are
available.

The limitations of this measure, in terms of what it tells about the
effects of the block of school variables, were discussed above. But even
accepting the use of a measure to show the effect of a block of variables
as distributed, over and above the effect of Block 1 and 2 variables, the
question is, is it the correct measure? I think not.

send their children to private schools, or tutored them at home, while
families with little money or interest in their children's education, and
little capability of teaching them (mostly black) neither sent their
children to school nor tutored them. The results were greatly increased
differences in cognitive skills as a function of family background -
indicating through their absence the effect that uniform societal forces
outside the family have in reducing the effect of family variations on
achievement. But in the present studies, the only children tested were
children in school, and except at the highest age level, nearly all the
children in the population were in school, except in Chile (age 10, 94%,

"Explained variance" ($R^2$) is the square of the multiple correlation
coefficient ($R$), and $R^2$ has come to be fashionable as a measure of the
power of the set of independent variables in explaining variation in the
dependent variable. "Explained variance added" by a set of variables
then becomes the difference between $R^2$ of the regression with these
variables and $R^2$ of the regression without them. My point is that all this
is mistaken, and instead that the multiple correlation coefficient itself
should be used, that is, $R$ instead of $R^2$, and the "correlation added," or
"explained variation added," rather than "explained variance added" be
used as a measure of the contribution of a set of variables over and above
the effect of another set.

In arguing for measures other than the added explained variance or
unique variance, I must point out that I have until very recently used
these measures in all my work, and in Equality of Educational Opportunity,
we used them. Although I still believe these measures are superior for
the present purpose to others currently in use, I think they are not as
good as the alternatives I am proposing here.

The basis for this argument is rather straightforward. Consider a
set of three regression equations such as the ones below for England in
Literature (p. 165). The variance explained by a given set of
variables, s, is denoted by $R_s^2$, and the variance explained by that set
excluding Block i variables is denoted by $R_{s-i}^2$.

age 14, 71%), India (age 10, 50%, age 14, 25%), Iran (age 10, 75%, age 14,
25%), Italy (age 14, 55%) (Science, p. 57). Consequently, such total
societal forces could not be measured.

                                                  $R_s^2$      $R_s$      $R_s^2 - R_{s-i}^2$   $R_s - R_{s-i}$
                                                  (variance    (mult.     (variance             (variation
                                                  explained)   correl.)   added)                added)

1. Block 1 (home background)                        .252        .502        .252                  .502
2. Blocks 1 and 2 (+ type of school & program)      .371        .609        .119                  .107
3. Blocks 1, 2, 3 (+ instructional variables)       .410        .640        .039                  .031
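The four columns of this tabulation can be recomputed from the three reported values of $R^2$ alone. The Python sketch below is an illustration rather than part of the original analysis: the function name `block_measures` is hypothetical, and the multiple correlations are rounded to three decimals before differencing, on the assumption that the tabulated figures were produced that way.

```python
import math

def block_measures(r_squared):
    """Return one tuple per regression equation:
    (R^2, R, variance added, variation added).

    r_squared lists the R^2 of each successive regression
    (Block 1; Blocks 1 and 2; Blocks 1, 2, and 3).
    """
    rows = []
    prev_r2, prev_r = 0.0, 0.0
    for r2 in r_squared:
        r = round(math.sqrt(r2), 3)  # multiple correlation, rounded as tabulated
        rows.append((r2, r, round(r2 - prev_r2, 3), round(r - prev_r, 3)))
        prev_r2, prev_r = r2, r
    return rows

# England, Literature: R^2 for the three successive equations.
england_literature = block_measures([0.252, 0.371, 0.410])
for row in england_literature:
    print(row)
```

The variance added and the variation added are close here only because the multiple correlations sum to roughly one; the two columns answer different questions, as the discussion that follows argues.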



The measures used in the studies ($R_s^2$ and $R_s^2 - R_{s-i}^2$) are reported,
along with the measures I propose, $R_s$ and $R_s - R_{s-i}$. Now suppose,
looking first at the first equation, we consider an attempt to explain
cognitive achievement in literature by a single composite variable, "home
background." That composite variable is a linear combination of the variables
that go into it which minimizes the sum of squared deviations of the predicted
dependent variable from the actual one. That is, it is a composite variable
which is a weighted sum of the various home background variables,
the weights being the regression coefficients themselves. Now since we
conceive of such a variable, "home background," we can ask what would be
the expected performance of a child at a given percentile of home
background - say the 25th and 75th percentiles? What is meant by the "25th
percentile of home background" is a family background low in those resources
that contribute to a child's performance in literature. The particular
combination of resources might differ: one child with a family background
at the 25th percentile might have low father's education and high mother's
education, while another might have high father's education and low
mother's education. The two backgrounds are equivalent only in that on
the average, homes with these two combinations of resources both produce
children at the 25th percentile of the distribution of predicted
performance.

Now the answer to the question of what would be the expected performance
of a child at the 25th and 75th percentiles of family background is
not, as one might at first glance expect, the 25th and 75th percentiles
of actual performance. It is, rather, the 25th and 75th percentiles of
predicted performance, which has a smaller standard deviation than the
distribution of actual performance. The ratio of the standard deviation
of predicted performance to actual performance is simply $R_1$, the multiple
correlation between achievement and the home background variables in
Block 1. Thus if the average performance is 50, and the standard deviation
is 10, then assuming a normal distribution, a child at the 25th percentile
will score 43.26 while a child at the 75th percentile will score 56.74,
that is 6.74 points below and above the average, respectively. A child
in England at the 25th percentile of home background will score at
50 - 6.74 $R_1$, or 46.61, while a child at the 75th percentile of home
background will score at 50 + 6.74 $R_1$, or 53.39. That is, if we know that
the score for a given percentile is a certain distance below or above the
mean, then the score for that same percentile on this composite variable
will be $R_1$ times that distance - in this case, 50.2 percent of that
distance. Thus we can reasonably say that this composite variable, home
background, accounts for 50.2 percent of the variation (not variance) in
achievement. What is true at the 25th percentile is true at all percentiles -
the child at a given percentile on home background is $R_1$ times the
distance from the mean in achievement that the child at the same percentile
in performance is. Or to look at it differently, if we know the difference
in achievement score between any two percentiles (a difference of 13.48
points for the 25th and 75th in the example above), then the difference in
achievement score of the average child with a home background at those same
two percentiles will be $R_1$ times that difference, or 13.48 (.502) = 6.78
in this case.

It is useful also to recognize that $R_1$ is the standardized regression
coefficient $\beta_1$ of the composite variable home background when no other
variables are controlled (i.e., including effects of all variables with
which it is correlated). Ordinarily, one does not use standardized regression
coefficients in single-variable regression analysis; however, this identity
between $R_1$ and $\beta_1$ will be important to the later discussion.
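The percentile argument can be checked numerically. The sketch below assumes, as the text does, a normal distribution of achievement with mean 50 and standard deviation 10; the function name `expected_score` is hypothetical and serves only this illustration.

```python
from statistics import NormalDist

def expected_score(percentile, r, mean=50.0, sd=10.0):
    """Expected achievement of a child at a given percentile of a composite
    predictor whose multiple correlation with achievement is r."""
    z = NormalDist().inv_cdf(percentile)  # standard normal deviate for the percentile
    return mean + r * z * sd

R1 = 0.502  # England, Literature, Block 1 (home background)

low = expected_score(0.25, R1)
high = expected_score(0.75, R1)
print(round(low, 2), round(high, 2))

# Setting r = 1.0 gives the percentiles of actual performance.
actual_gap = expected_score(0.75, 1.0) - expected_score(0.25, 1.0)
print((high - low) / actual_gap)  # the share of the distance covered is R1
```

Whatever pair of percentiles is chosen, the ratio of the predicted gap to the actual gap comes out equal to the multiple correlation, which is the sense in which $R$ rather than $R^2$ measures the proportion of variation accounted for.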

By extension, the same argument given above now holds if we create a
composite variable from home background and type of program and school, that
is, from the Block 1 and Block 2 variables combined. We can ask the question
about two children at the 75th and 25th percentile of a new compound
variable, made up of home background and type of program and school: what
proportion of the total distance above the mean to the 75th percentile is
the child who is at the 75th percentile in this combination of home and
school resources? That proportion is $R_{12}$, where $R_{12}$ is the multiple
correlation of achievement with home background and type of program and
school. $R_{12}$ is also the standardized regression coefficient $\beta_{12}$ of
the composite variable made up of Blocks 1 and 2, when no other variables are
controlled. Of course it may not make much sense to define a new variable as
"home and school resources" (where we mean by school resources type of school
and type of program). If not, then the usefulness of $R_{12}$ is not great.
But in any case, our interest is not so much in the explanatory power of
the best combination of home and school resources as in the explanatory
power of school resources themselves, apart from home resources. The
natural thing to do then seems to be to subtract out the variation that can
be explained by home alone, and consider only the additional variation that
is accounted for by the Block 2 variables, that is, $R_{12} - R_1$, or thinking
in terms of standardized regression coefficients, $\beta_{12} - \beta_1$. Consider
what this tells us. It tells us the additional explanatory power of the
best compound variable made up of home resources and school type and
program, beyond the explanatory power of the best compound variable of
home resources by themselves. What this tells is the additional power
to explain achievement brought in through type of school and program that
is distributed independently of home background, that is, effect of type
(3) discussed on page 14 above, the kind of effect that $R_{12}^2 - R_1^2$
measures imperfectly. But it does not tell us in any way the total direct
effect of those Block 2 variables, that is, the effect of type (2) on page 14
above. What would such an effect be? We can think of a hypothetical
experiment something as follows: suppose for two children, the Block 1
variables, home background, were at their average position for the
population. Then if for one of the children, the Block 2 variables were held
at the population average, his predicted score would be exactly at the
population average. But for the other child, the Block 2 variables are
put at their 75th percentile level. What we mean by "their 75th percentile
level" is this: in this equation, including the Block 1 variables, we find
the linear combination of Block 2 variables that in the presence of Block
1 variables has the highest partial correlation with achievement. This
becomes for us a new variable which we label "type of program and school,"
or "Block 2". Then we find the 75th percentile position on this new
variable (which we can assume is normally distributed, for purposes of the
hypothetical experiment and the measure that this experiment is leading
toward). And the second child is at that 75th percentile level, along
with his 50th percentile family background level. Our question then becomes,
what is the predicted achievement of this child at the 50th percentile of
family background (Block 1) and the 75th percentile of Block 2? Further,
thinking of the difference between the average score (50th percentile of
Block 1 and 50th percentile of Block 2) and the 75th percentile score (which
lies simply .6745 times the standard deviation of achievement above the
average), what proportion of that distance is covered by the score of our
second child? Whatever measure will give us the answer to that question is
the measure of the total direct effect of the type of program and school
compound variable.

We can easily see how, if we truly went to the trouble of making up that
compound variable as described above, such a measure would be directly
forthcoming from the regression equation. We can see how by directly
calculating the three scores in question:

1) 75th percentile score: $\bar{y} + .674\sigma_y$

2) 50th percentile or average score
   (with average Block 1 and Block 2 variables): $\bar{y}$

3) predicted score of a child with average Block 1 variables ($x_1$) and
   75th percentile on Block 2 variable ($x_2$)

If the regression equation is $y = a + b_{1.2} x_1 + b_{2.1} x_2$, then the
predicted score $y^*$ is

$$y^* = a + b_{1.2}\bar{x}_1 + b_{2.1}(\bar{x}_2 + .674\sigma_{x_2}) = \bar{y} + b_{2.1}(.674\sigma_{x_2})$$

But if $b_{2.1}$ is the raw regression coefficient of Block 2 variables,
controlling on Block 1, and $\beta_{2.1}$ is the standardized one, then
$b_{2.1} = \beta_{2.1}\,\sigma_y / \sigma_{x_2}$.

Therefore, the proportion of the distance between scores (1) and (2) covered
by (3) is:

$$\frac{\bar{y} + \beta_{2.1}(.674\sigma_y) - \bar{y}}{\bar{y} + .674\sigma_y - \bar{y}} = \beta_{2.1}$$

Thus the desired measure, showing the proportion of variation in achievement
that the Block 2 variable will explain when Block 1 variables are held
fixed, is merely the standardized regression coefficient in the equation
containing Block 1 and Block 2.

The approach taken here toward carrying out hypothetical experiments
with blocks of variables can be carried further. Some examples will show
the generality of this approach toward measures of effects in regression
analysis, and will make evident that it should not be practiced merely
through blind calculation of standardized regression coefficients without
regard for precisely what is the desired measure. One hypothetical
experiment would be to ask the achievement of a child who is at the 25th
percentile in home background (Block 1), and at the 75th percentile in
school type and program (Block 2). Is this score above or below the mean,
and what proportion of the distance from the mean to the 75th or 25th
percentile is it? The score is:

$$y^* = \bar{y} + (\beta_{2.1} - \beta_{1.2})(.674\sigma_y)$$

If the quantity in parentheses is positive, the score is above the mean.
The score at the 75th percentile is $\bar{y} + .674\sigma_y$, so that the
proportion of distance from $\bar{y}$ to the 75th percentile is

$$\frac{\bar{y} + (\beta_{2.1} - \beta_{1.2})(.674\sigma_y) - \bar{y}}{\bar{y} + .674\sigma_y - \bar{y}} = \beta_{2.1} - \beta_{1.2}$$

If the score is below the mean, the proportion of the distance to the 25th
percentile would be calculated in the same way, but would be
$\beta_{1.2} - \beta_{2.1}$.






A second hypothetical experiment would be this: suppose a child is one
standard deviation below the mean in home background (Block 1) resources,
and at the mean in school type (Block 2). Then how many standard deviations
above the mean should his school resources (Block 3) be to make his predicted
score be at the mean? This is answered by setting up the appropriate
equation with the desired number of standard deviations represented by an
unknown, $k$.

$$\bar{y} = a + b_{1.23}(\bar{x}_1 - \sigma_{x_1}) + b_{2.13}\bar{x}_2 + b_{3.12}(\bar{x}_3 + k\sigma_{x_3})$$

$$\bar{y} = a + b_{1.23}\bar{x}_1 + b_{2.13}\bar{x}_2 + b_{3.12}\bar{x}_3 - b_{1.23}\sigma_{x_1} + k\,b_{3.12}\sigma_{x_3}$$

$$b_{1.23}\sigma_{x_1} = k\,b_{3.12}\sigma_{x_3}$$

$$\beta_{1.23}\sigma_y = k\,\beta_{3.12}\sigma_y$$

$$k = \frac{\beta_{1.23}}{\beta_{3.12}}$$
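The result of this second experiment is simply a ratio of two standardized coefficients, which a few lines of Python make concrete. The function name `offsetting_sd` and the coefficient values below are hypothetical, chosen only for illustration; they are not taken from the studies.

```python
def offsetting_sd(beta_home, beta_school):
    """Number of standard deviations above the mean that Block 3 must be
    to offset a one-standard-deviation deficit in Block 1, given the two
    standardized regression coefficients from the same equation."""
    if beta_school == 0:
        raise ValueError("Block 3 has no direct effect; no finite offset exists")
    return beta_home / beta_school

# Hypothetical coefficients, for illustration only.
beta_home, beta_school = 0.40, 0.20
k = offsetting_sd(beta_home, beta_school)

# Check in standardized units: one SD down on Block 1, k SDs up on Block 3.
predicted_deviation = beta_home * (-1.0) + beta_school * k
print(k, predicted_deviation)
```

A large value of the ratio says that school resources would have to be moved far out in their own distribution to compensate for an ordinary home disadvantage, which is one way of reading the relative sizes of the two coefficients.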

These hypothetical experiments show the generality of manipulations among
standardized regression coefficients as a way of measuring the effects of
changing various independent variables by some standard distance related
to their distribution in society, such as one standard deviation.

It is important to recognize that even if we had the standardized
regression coefficients, they would be no panacea. They tell us only the
effect of type (2) on page 14. Suppose it is empirically the case that
children in the high-performance schools and programs (Block 2) are nearly
always from good home backgrounds (Block 1). Then it is quite artificial,
in one sense, to perform the hypothetical experiment discussed above, for
seldom does a child from an average home background attend a 75th
percentile school. To know the potential for higher performance of a child in
a selective secondary school with average home background is of purely
academic interest if it infrequently happens. What is of greater interest
is the amount of variation that the schools in fact account for
independently of the family, that is, $R_{12} - R_1$, or $\beta_{12} - \beta_1$.
If there is not such a high correspondence between Block 1 and Block 2, the
potential effect is in fact realized for some children, and it is of more
than academic interest.*

* I have discussed Block 2 variables as if they were school variables that
had an effect on learning. In fact, as the authors point out, they are
primarily indicators of differential selection of differently-performing
students into different schools, and thus surrogates for unmeasured
variations in student input into the schools. They show not how much
learning takes place in the schools, but how much selection.

In the above discussion, it was assumed that Block 2 consisted of a
single compound variable. But if a block consists of a number of variables,
how do we get a measure for the overall impact of the block of
variables, considered conceptually as a compound variable (e.g., "school
variables," "instructional variables," or "learning conditions," as the
authors termed Block 3 variables), but in practice left as a set of
variables? One of the appealing aspects of "added explained variance"
($R_s^2 - R_{s-i}^2$) or "added explained variation" ($R_s - R_{s-i}$) is that
it serves as a global measure for the whole set of variables that are entered
in a given block. Since the blocks have a coherent meaning or interpretation,
it is useful to have a measure, comparable to the standardized regression
coefficient, for the block considered as a compound variable, even though
it is a number of separate variables. This measure would tell us the
proportion of variation that Block 2 variables, considered as the
best-predicting compound, will explain when all Block 1 variables are held
fixed. But it turns out this is not quite so simple to obtain as the
added explained variance, $R_{12}^2 - R_1^2$, or the added explained variation,
$R_{12} - R_1$. Conceptually, there is no problem. In fact, one way of defining
$\beta_{2.1}$ gives an interesting insight into the conceptual difference
between the "added variance" and this measure. The added variance for Block 2
is $R_{12}^2 - R_1^2$, and the square of the standardized regression coefficient
$\beta_{2.1}$ is this same quantity divided by one minus the square of the
correlation between the Block 2 compound and the Block 1 compound, which we
will label $R_{21}$. (In the equation below, we will introduce the subscript 4
for the dependent variable of achievement, to prevent confusion.) The equation
for $\beta_{42.1}^2$ is:

$$\beta_{42.1}^2 = \frac{R_{4.12}^2 - R_{41}^2}{1 - R_{21}^2}$$

The division by $1 - R_{21}^2$, which is the only difference between $\beta^2$
and added variance, shows the conceptual difference between the two. For
$R_{21}^2$ is the proportion of variance in the Block 2 variable compound that
is accounted for by the compound of Block 1 variables (not a compound designed
to best explain the Block 2 compound, but the compound based on regression
weights in the equation with achievement as dependent variable, and Blocks 1
and 2 as independent). Thus the added variance is merely the square of the
regression coefficient discounted by the variance that is common with the
Block 1 compound.
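The relation between the added variance and the squared standardized coefficient can be verified for the simplest case, in which each block is a single standardized variable. The sketch below uses the ordinary two-predictor regression formulas; the function name `two_predictor_summary` and the three correlations are hypothetical, chosen only for illustration.

```python
def two_predictor_summary(r41, r42, r21):
    """For standardized predictors 1 and 2 of outcome 4, return
    (beta_42.1, added variance R^2_4.12 - R^2_41).

    r41, r42: correlations of each predictor with the outcome.
    r21: correlation between the two predictors.
    """
    beta_42_1 = (r42 - r41 * r21) / (1 - r21 ** 2)
    beta_41_2 = (r41 - r42 * r21) / (1 - r21 ** 2)
    r2_full = beta_41_2 * r41 + beta_42_1 * r42  # R^2 of the two-predictor equation
    return beta_42_1, r2_full - r41 ** 2

# Hypothetical correlations, for illustration only.
r41, r42, r21 = 0.5, 0.4, 0.6
beta, added = two_predictor_summary(r41, r42, r21)

# The added variance equals beta^2 discounted by (1 - r21^2).
print(added, beta ** 2 * (1 - r21 ** 2))
```

With these values the coefficient is about .16 while the added variance is under .02, which shows how strongly a correlation of .6 between the blocks depresses the added-variance figure relative to the coefficient.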

To calculate this standardized regression coefficient, it would be
possible to first carry out the full regression of achievement on Block 1
and 2 variables, then to create a new compound variable from the Block 2
variables, by using the regression coefficients as weights, recalculate
the correlation matrices including the newly-defined variable, and then
carry out another regression analysis, in which the new compound Block 2
variable replaces the set of variables from which it was compounded. The
new regression equation is identical to the preceding one, in total variance
explained and in regression coefficients for the unaffected Block 1
variables. For the Block 2 compound, the raw regression coefficient and the
standardized coefficient are identical, and are the desired measure: the
proportion of variation in achievement that will be accounted for by
Block 2 variables when Block 1 variables are held fixed.

This kind of recalculation is, however, technically cumbersome. It
involves recalculation of the covariance matrix every time a new compound
is created. Instead of that, it is possible to proceed in any of several
ways that make it possible to calculate these standardized regression
coefficients without recalculating the correlation matrices. I will discuss
two such ways in Appendix 2, but here it is sufficient to note that if
we have the multiple correlation between the Block 2 variables alone and
achievement, as well as the multiple correlation between Block 1 and
achievement and between Blocks 1 and 2 and achievement, it is possible to
calculate the standardized coefficient directly. For in the equation given
earlier, only $R_{4.12}$, $R_{41}$, and $R_{21}$ are necessary to calculate
$\beta_{42.1}$. $R_{4.12}$ and $R_{41}$ are multiple correlations of 1 and 2
with 4 and 1 with 4, respectively. $R_{21}$ is not easy to obtain directly,
but if $R_{42}$, which is easy to obtain, is known along with $R_{4.12}$ and
$R_{41}$, $R_{21}$ may be calculated from them, as described in the appendix.
The three multiple correlations necessary to calculate $R_{21}$ are not given
in these studies, but two of them are, and it is possible to calculate
reasonable upper and lower bounds on the desired standardized regression
coefficient, by the method described in Appendix 2. Thus in these studies, we
can approximate the desired standardized regression coefficient $\beta_{2.1}$
for Block 2 variables by

$$\beta_{2.1} \approx \sqrt{R_{12}^2 - R_1^2}\;\cdot\;\frac{1}{2}\left(1 + \frac{1}{\sqrt{1 - R_1^2}}\right)$$

and for Block 3 variables

$$\beta_{3.12} \approx \sqrt{R_{123}^2 - R_{12}^2}\;\cdot\;\frac{1}{2}\left(1 + \frac{1}{\sqrt{1 - R_{12}^2}}\right)$$

This will give us estimates of the total direct effect of Block 2 and 3
variables to complement the direct effects distributed independently. It
is interesting that although this measure is very different from the added
explained variance, it can be roughly estimated from the same two quantities,
$R_{123}^2$ and $R_{12}^2$, which are used to calculate the added explained
variance.
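The printed approximation for Block 3 is badly damaged in this copy. The sketch below therefore rests on an assumption: that the estimate is the square root of the added explained variance multiplied by the average of 1 and $1/\sqrt{1 - R_{12}^2}$. That reading is not confirmed by the text itself, but it reproduces the value of 22.3 tabulated for England later in the paper when the England figures quoted earlier ($R_{12}^2 = .371$, $R_{123}^2 = .410$) are used. The function name `approx_total_direct` is hypothetical.

```python
import math

def approx_total_direct(r2_with, r2_without):
    """Approximate standardized coefficient of a block, from the R^2 of the
    equations with and without that block (assumed form, see lead-in)."""
    added = r2_with - r2_without
    lower = math.sqrt(added)  # exact if the block is uncorrelated with earlier blocks
    inflation = 1 / math.sqrt(1 - r2_without)
    return lower * 0.5 * (1 + inflation)

# England, Literature: Blocks 1-2 versus Blocks 1-3.
england = approx_total_direct(0.410, 0.371)
print(round(100 * england, 1))
```

Because only the two $R^2$ values enter, the estimate can be formed for every country from the published tables, at the price of not knowing how close the true coefficient lies to either end of the assumed range.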

Now it becomes possible to indicate what would be the approximate
measures for the four kinds of effects discussed for Blocks 1 and 3 earlier
in the paper. The notations used will be subscript i for Block i, with
subscript 4 denoting the dependent variable, achievement; $R_{4i}$ as the
multiple correlation of Block i alone with achievement; $R_{4.ijk}$ as the
multiple correlation of Blocks i, j, k with achievement; $\beta_{4i.jk}$ as the
standardized multiple regression coefficient of achievement on the Block
i variables, considered as a compound, when Blocks j and k are controlled.

1. Total effect of variations in Block 1 variables, both through
   school and outside it: $R_{41}$

2. Direct effect of Block 3 variables, whether this effect implements
   the force of home and school type through its distribution, or
   not: $\beta_{43.21}$

3. Independent direct effect of Block 3 variables as distributed, in
   implementing societal variations beyond the home:
   $R_{4.321} - R_{4.21}$ (or $\beta_{4.321} - \beta_{4.21}$)

4. Direct effect of Block 1 variables, apart from their effects through
   shaping or selecting schools: $\beta_{41.23}$

There is one important point that should be made about the virtues of
the "added explained variance" or "unique variance" measures used in these
studies. Although they are not estimates, as are the measures I have
proposed here, of the amount of variation explained by particular variables
or blocks of variables, they have some special virtues. This is best seen
by writing again the equation for $\beta_{43.12}^2$:

$$\beta_{43.12}^2 = \frac{R_{4.123}^2 - R_{4.12}^2}{1 - R_{3.12}^2}$$

Now the added, or unique, variance is simply $\beta_{43.12}^2 (1 - R_{3.12}^2)$,
that is, the square of the standardized regression coefficient, the measure I
have proposed, times the variance in Block 3 (or variable 3) which is not
accounted for by the other independent variables, 1 and 2. The crucial
difference between the two is not the square vs. non-square, but the
discounting by $1 - R_{3.12}^2$ or not. The regression coefficient $\beta$ in
effect says to take the (square root of) the added explained variance, but to
take into account that a portion of the variance of variable 3, $R_{3.12}^2$,
had no chance to be effective, because that variance coincided with the
variance of variables 1 and 2 which already accounted for variance
$R_{4.12}^2$. Thus it multiplies the (square root of) the added explained
variance by a ratio $1/\sqrt{1 - R_{3.12}^2}$, which inflates
$\sqrt{R_{4.123}^2 - R_{4.12}^2}$ to what its estimated value would be if
variables 1 and 2 were not correlated with it.

The only difficulty with this procedure is that if $R_{3.12}^2$ is high, then
this constitutes a great inflation, and in effect a large extrapolation of
the variation it has explained - an extrapolation that could be mistaken.
For this reason, added explained variance, $R_{4.123}^2 - R_{4.12}^2$ (what is
called in Literature $b^2/c$ for single variables) is a valuable conservative
statement about the effect of variable or Block 3. It is particularly
valuable for estimating the relative effects of different school variables
which have different correlations with home background variables, and for
which the extrapolation due to $R_{3.12}^2$ might be very different, and
possibly a bad extrapolation to use. It is used in this way in these studies
(particularly Literature) for the examination of individual variables.
However, its depressed value due to $R_{3.12}^2$, and its use in the squared
rather than the square root form, leads to an unwarranted pessimism about the
size of the effects.

I would prefer that the square root be used, because that would have
an explicit meaning in terms of variation of the dependent variable
analogous to those described earlier. The meaning would be this:
$\sqrt{R_{4.123}^2 - R_{4.12}^2}$, or equivalently
$\beta_{43.12}\sqrt{1 - R_{3.12}^2}$, is the proportion of the distance
between scores of the student at percentile A on achievement and the score
of the student at percentile B which would be covered by changing the Block
3 composite variable for a student from whatever percentile it was at when
the Block 1 and 2 composite was at percentile A, to percentile B. (If there
is a strong correlation between Block 3 and Blocks 1 and 2, the original
percentile of the Block 3 composite will already be some distance toward
percentile B, because of the correlation. If not, that is, if $R_{3.12}^2$ is
close to zero, then the Block 3 composite will originally be close to
percentile A.)

Thus the use of added explained variance, but in its square root form,
constitutes a useful statement about the effect of Block 3 variables
as distributed. There could be some argument that
$\sqrt{R_{4.123}^2 - R_{4.12}^2}$ is preferable to $R_{4.123} - R_{4.12}$ as a
measure of the effect of Block 3 as distributed, independently of Blocks 1
and 2. These two measures would give quite different estimates. It would be
useful to explicate exactly the operational differences (in terms of
hypothetical experiments of the sort discussed earlier) between the two
measures, but limitations of time, space, and purpose of this paper prevent
that here.

Now from the three studies under consideration, we can, with the
published data, obtain one measure of the effect of school variables (#3
above) and an approximation for the other (#2 above). The measure
$\beta_{123} - \beta_{12}$ (that is, $R_{123} - R_{12}$) is the direct effect of
school variations on achievement that is distributed independently of home
background and type of school, and the approximation for $\beta_{3.12}$, which
is a function of $R_{123}^2$ and $R_{12}^2$, shows the direct effect of school
variations on achievement, including both those that are an indirect
consequence of the school working through the home, and those that are
independent of it. We will call the first "independent explained variation"
and the second "total direct explained variation". The total direct explained
variation will always be larger than the independent explained variation,
because it includes the latter. The independent explained variation will be
larger than added explained variance under some circumstances, smaller under
others. Algebra will show that added explained variance exceeds independent
explained variation when $R_{123} + R_{12} > 1.0$. Thus when $R_{123}$ and
$R_{12}$ are greater than about 0.5, the added explained variance will be
greater. The added explained variance of course does not have such a
straightforward meaning as independent explained variation or total direct
explained variation.
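The algebra referred to here is the factoring $R_{123}^2 - R_{12}^2 = (R_{123} - R_{12})(R_{123} + R_{12})$, so that the added variance exceeds the independent variation exactly when the two multiple correlations sum to more than one. A short Python check, using the England correlations quoted earlier and one hypothetical pair of smaller correlations (the function name `compare_measures` is also hypothetical):

```python
def compare_measures(r_with, r_without):
    """Return (added explained variance, independent explained variation)
    for multiple correlations with and without the block in question."""
    added_variance = r_with ** 2 - r_without ** 2
    independent_variation = r_with - r_without
    return added_variance, independent_variation

# England, Literature: R_123 = .640, R_12 = .609 (sum exceeds 1).
av_high, iv_high = compare_measures(0.640, 0.609)

# Hypothetical weaker correlations whose sum is below 1.
av_low, iv_low = compare_measures(0.40, 0.30)

print(av_high > iv_high, av_low > iv_low)
```

The comparison therefore depends only on the overall level of the correlations, not on how much the block in question contributes.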

It is useful to see what happens to the countries when the two measures
I have proposed are used for measuring the effect of school variables, in
place of the measure used by the authors. The tabulation is given below
for Population II, Literature (p. 163). (All measures are multiplied by
100.)

                               Added Explained         Indep. Explained      Total Direct
                               Variance                Variation             Expl. Variation
                               $R_{123}^2 - R_{12}^2$   $R_{123} - R_{12}$    $\beta_{3.12}$
                               value    rank           value    rank         value    rank

Belgium (Flemish-speaking)      6.9       6             6.4       7           28.4      5
Belgium (French-speaking)       9.0       2             8.3       3           32.3      1
Chile                           8.9       3             8.9       2           31.7      3
England                         3.9       9             3.1      10           22.3      9
Finland                         5.5       8             5.0       8           25.5      8
Iran                           12.1       1            17.3       1           27.?      6
Italy                           3.0      10             3.3       9           18.3     10
New Zealand                     8.7       4             7.5       5           32.3      1
Sweden                          6.1       7             7.2       6           25.8      7
United States                   7.7       5             7.8       4           29.5      4



This tabulation shows that the measure of independent explained variation,
$R_{123} - R_{12}$, is close to the added variance measure used by the
authors. The second measure, of the total direct explained variation in
achievement by school variables, is much larger, and ranks the countries
much differently. But the two measures I have proposed can be compared,
because they are both measures of the proportion (or, multiplied by 100,
percentage) of the difference between any two percentile scores that is
accounted for by the same percentiles in the independent variables,
considered as a best-predicting composite. The measure of independent
explained variation is the effect of these school variables, as
distributed, independently of the home backgrounds that partly determine
their distribution. The measure of total direct explained variation is the
total effect of these school variables if they were distributed
independently of home background.

The first measure may be thought of as the actual effect of the
schools in interrupting the transmission of home background resources
across generations, or the effect of the schools in equalizing
educational opportunity. The second, as the total effect of the schools,
is the potential the schools have for doing this if they were in fact
distributed independently of home backgrounds. Countries can be
compared according to the proportion of their schools' total impact that is
distributed independently of home background resources (and, of course,
type of school and program, which is not itself distributed wholly in
accord with home background). Such a comparison, using the same
countries and same age group as before, is shown below. Except for Iran,





                        Total Explained     Proportion Independent
                        by Block 3          of Blocks 1 and 2

Belgium (Flemish)            28.4                  .23
Belgium (French)             32.3                  .25
Chile                        31.7                  .28
England                      22.3                  .14
Finland                      25.5                  .20
Iran                         27.1                  .63
Italy                        18.3                  .18
New Zealand                  32.3                  .23
Sweden                       25.8                  .28
United States                29.5                  .26



where a very large proportion of the schools' impact is independent of home
background (though so little variation is explained by home background in
Iran, .21, much less than in any other country, that there may have
been serious measurement problems), only about a quarter of the school
variables' influence is independent of home background and type of school.
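
The proportion in the second column is simply the independent explained variation divided by the total direct explained variation. The sketch below recomputes it for a few countries from the Population II Literature figures; the function and variable names are illustrative assumptions, and small differences from the printed proportions would reflect rounding in the published values.

```python
# Independent explained variation and total direct explained variation
# (both multiplied by 100), as tabulated for Population II, Literature.
measures = {
    "Chile": (8.9, 31.7),
    "England": (3.1, 22.3),
    "Finland": (5.0, 25.5),
    "Sweden": (7.2, 25.8),
    "United States": (7.8, 29.5),
}


def proportion_independent(independent, total_direct):
    """Share of the total direct school effect that is distributed
    independently of home background and school type."""
    return independent / total_direct


proportions = {
    country: round(proportion_independent(indep, total), 2)
    for country, (indep, total) in measures.items()
}

for country, share in proportions.items():
    print(f"{country:15s} {share:.2f}")
```

Among these five countries the smallest share belongs to England, in line with the discussion that follows.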
Among all countries, England is lowest. This accords with qualitative
impressions, according to which England's schools are still more strongly
class stratified than in most countries. As we will see in subsequent
tabulations, England is consistently lowest in this measure of equalizing
opportunity through the schools.

Now, finally, it is possible to get on with the task of examining
the implications of these findings for our knowledge of effects of the
school on learning. In doing this, I will focus on only those six countries
that were covered in all three studies: Chile, England, Finland, Italy,
Sweden, and the U.S.

II. Substantive Results

Between-school and Between-student Analyses:

In all three studies, there were regression analyses carried out in 
which the total variation to be accounted for was that between school 
averages, and regression analyses in which the total variation to be 
accounted for was the full variation between individual students. Which 
should be used for assessing the effects of school variables? 

The between-schools analysis, by standardizing the variance between
schools to 1.0, prevents one question from being asked: what is the overall
impact of measured school variables upon achievement, compared to all the
other variables that make some individuals achieve more highly than others?
Since the strength and distribution of school variables will partly
determine the amount of variation among schools, the standardization of the
between-school variance to 1.0 eliminates a portion of the effect that
school variables have. Further, this unknown portion may vary from country
to country. Only if one used raw regression coefficients to evaluate the
effects of schools would the measure of effect be comparable in the
between-schools and between-students analyses. But because one is interested
in the effects of clusters of school variables (partly because any single
variable has a very small effect), for which raw coefficients cannot be
used, measures such as those discussed in preceding sections are
necessary.

For this reason, in examining overall effects of school variables,
the between-student analysis should be used. This analysis has the defect
that it includes a lot of variance due to individual differences within
schools, which cannot be explained by variables that are constant for a
school. Thus the effect of school variables appears especially small.
This is one reason between-school analyses are carried out. However, for
the reasons discussed above, it appears better to recognize that school
variables can account only for that fraction of the total variance that
lies between schools, and to use the between-student analysis for the study
of effects of school variables.
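
The ceiling referred to here, the fraction of total variance that lies between schools, can be made concrete with a small sketch. The data and the function name below are invented for illustration only; they are not drawn from the studies.

```python
from statistics import fmean


def between_school_share(scores_by_school):
    """Fraction of the total sum of squares in student scores that lies
    between school averages. Variables that are constant within a school
    cannot account for more than this fraction in a between-student
    analysis."""
    all_scores = [s for scores in scores_by_school.values() for s in scores]
    grand_mean = fmean(all_scores)
    total_ss = sum((s - grand_mean) ** 2 for s in all_scores)
    between_ss = sum(
        len(scores) * (fmean(scores) - grand_mean) ** 2
        for scores in scores_by_school.values()
    )
    return between_ss / total_ss


# Hypothetical scores for three schools of three students each.
example = {
    "school_a": [40, 50, 60],
    "school_b": [50, 60, 70],
    "school_c": [60, 70, 80],
}
share = between_school_share(example)
print(f"between-school share of variance: {share:.2f}")
```

In this invented example half of the variation lies between schools, so no set of school-level variables could explain more than half of the between-student variation.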

In use of the between-student analysis, I will focus on the six
countries which engaged in testing in literature, reading, and science. These
are Chile, England, Finland, Italy, Sweden, and the United States. In
addition, I will confine my attention to the 10 and 14 year-olds.
Population IV, the last year of secondary school, is such a non-random sample
of the population of the age cohort that an examination of the amount
of variance or variation accounted for by differences among school
resources is not likely to prove very productive. Beyond this, literature
was not administered for the 10 year-olds. Altogether, I will examine a
very partial set of data.

For these six countries, I have calculated the measures described
above, which I will restate here:

1. The total effects of home background: $R_1$

2. The total direct effects of school type and program: $\beta_{2.1}$

3. The effects of school type and program distributed
   independently of home background: $R_{12} - R_1$

4. The proportion of school type effects that are
   distributed independently of home background: $(R_{12} - R_1)/\beta_{2.1}$

5. The total direct effects of school variables: $\beta_{3.12}$

6. The effects of school variables distributed independently
   of home background and school type and program: $R_{123} - R_{12}$

7. The proportion of school variable effects that are
   distributed independently of home background and
   school type and program: $(R_{123} - R_{12})/\beta_{3.12}$

Of these measures, I will neglect 2, 3, and 4, having to do with
school type and program, because the "effects" of school type and program
are primarily effects of selection of differently-achieving students into
different programs or schools. The school effects, insofar as they exist,
are to be found in the specific school variables, while the school type is
more a consequence of achievement than a cause.

Tables 1 and 2 show measures 1, 5, 6, and 7 for literature, reading, and
science for these six countries, for the 14 year-olds, and for reading and
science for the 10 year-olds. The first table shows, for example, that in
Chile 38% of the variation among 14 year-olds' achievement in literature is
accounted for by the total effect of home background, including both direct
effects and effects through the schools. In the same country and subject,
9% of the variation is accounted for by school variables that are distributed
independently of the home background and school type variables.
Table 1

Population II: age 14

                                 Chile  England  Finland  Italy  Sweden   U.S.  Average

Total home background effects, $R_1$
  Literature                      .38     .50      .43     .33     .39     .43     .41
  Reading                         .45     .52      .45     .32     .40     .47     .44
  Science                         .36     .48      .47     .32     .42     .47     .42
  Average                         .40     .50      .45     .32     .41     .46     .42

School effects distributed independently of home & school type, $R_{123} - R_{12}$
  Literature                      .09     .03      .05     .03     .07     .08     .06
  Reading                         .06     .02      .04     .03     .04     .06     .04
  Science                         .07     .05      .09     .07     .08     .07     .07
  Average                         .07     .04      .06     .05     .06     .07     .06

Total direct school effects, $\beta_{3.12}$
  Literature                      .32     .22      .26     .18     .26     .30     .26
  Reading                         .26     .19      .2?     .19     .18     --      --
  Science                         .26     .30      --      --      --      --      --
  Average                         .28     .25      .28     .21     .24     .29     .26

Proportion of school effects distributed independently of home and
school type, $(R_{123} - R_{12})/\beta_{3.12}$
  Literature                      .28     .14      .20     .18     .28     .26     .22
  Reading                         .22     .11      .16     .17     .20     .21     .18
  Science                         .26     .18      .25     .29     .27     .24     .25
  av$(R_{123} - R_{12})$/av$(\beta_{3.12})$   .25     .14      .23     .22     .26     .24     .22
Table 2

Population I: age 10

                                 Chile  England  Finland  Italy  Sweden   U.S.  Average

Total home background effects, $R_1$
  Reading                         .12     .47      .42     .31     .34     .45     .35
  Science                         .20     .46      .37     .20     .40     .42     .34
  Average                         .16     .46      --      .26     --      --      --

School effects distributed independently of home & school type, $R_{123} - R_{12}$
  Reading                         .17     .02      .03     .06     .04     .04     .06
  Science                         .16     .03      .05     .08     .06     .09     .08
  Average                         .17     .02      .04     .07     .05     .07     .07

Total direct school effects, $\beta_{3.12}$
  Reading                         .29     .13      .18     .22     .18     .21     .20
  Science                         .30     .18      .21     .20     .23     .32     .24
  Average                         .30     .16      --      .21     .21     .26     .22

Proportion of school effects distributed independently of home and
school type, $(R_{123} - R_{12})/\beta_{3.12}$
  Reading                         .58     .12      .18     .30     .23     .19     .27
  Science                         .49     .18      .24     .41     .25     .29     .31
  av$(R_{123} - R_{12})$/av$(\beta_{3.12})$   .56     .15      .21     .35     .24     .25     .31
But the total direct effect of the school variables is such that they
account for 32% of the variation in literature achievement when home
background and school type are controlled, so that the 9% constitutes only
28% (9/32) of the total school effect. The remaining portions of the two
tables can be read in the same way.

First, several general results are useful to state. For all three
subjects, the total effect of home background is considerably greater than
the total direct effect of school variables. The overall average is .42
for home background, but only .26 for school at age 14. For population I,
the 10 year-olds, school variables are higher relative to home background
(.22 to .35 for the overall averages), but because of measurement
differences for home background at the two ages, this might not be a true
difference. These comparisons show a much higher relative effect of school
variables to home background than is ordinarily reported, partly, as I
indicated earlier, because of the methods of reporting, which report
something like the second set of rows, the effect of school variables
distributed independently of home background. As the fourth set of rows
shows, this independently distributed effect of school variables is only
about 20 to 30 percent of the total, showing that most of the variations
in school resources go to reinforce home background rather than to
cross-cut it.

Comparisons between ages 10 and 14 show, in addition to the home
background differences discussed above, a slightly higher total direct
effect of schools at age 14 than at age 10 (.26 to .22 on average) but
a smaller proportion of it independent of home background (.22 to .31
on average).

Going beyond these overall comparisons, there are a number of
inferences that one can draw from the results shown in these tables. One of the
most interesting concerns a comparison between reading, on one hand, and
literature on the other. Looking at the averages over countries shows
several things. First, home background explains slightly more variation
in reading than in either literature or science for the 14 year-olds, or
science for the 10 year-olds. But if we examine the total direct effects
of school at both age levels, school variables account for somewhat less
variation in reading achievement than in literature or science at age
14, or science at age 10. Furthermore, looking at the last set of measures
for both 14 year-olds and 10 year-olds shows that a smaller fraction of
these (smaller) school effects is distributed independently of home
background for reading than for either of the other two subjects.

These data show fairly conclusively, then, that reading achievement is
more fully an outgrowth of home influences than are either of the other two
subjects, less a function of what takes place at school. This is a rather
important result, because it indicates that the general finding in this
study and others that home background is a much more powerful influence
than school influences in determining achievement is a result that is
subject-specific. Not all subjects are alike in the mix of family influences
and school influences that determines their achievement.

Beyond these comparisons by age group and by subject matter, there are
others that can be made about specific countries. England is perhaps the
most consistent, and because of that, the most interesting to look at. First,
for both ages, England shows the highest proportion of variation in
achievement explained by variations in home background (.50 for 14 year-olds,
.46 for 10 year-olds). Secondly, and most interesting, the fourth set of rows
shows that England has, in every subject and at both ages, the smallest
proportion of total direct effects of school variables distributed
independently of home background. This is the case even though, as the third set
of rows in each table shows, England does not have especially high total
school effects. The fourth set of rows can be regarded as a measure of
the equality of educational opportunity for children of different home
backgrounds. The higher the measure, the larger the proportion of variation
in achievement due to school resources that are distributed independently
of home background resources. On this, England is the only country that is
consistently different from (lower than) the others, but Chile shows an
exceptionally high figure at age 10.

These are the major inferences I am able to draw from these studies on
the effects of school characteristics on learning. The growth scores, the
retentivity analyses, the proportion of variance between schools (not reported
in Science) did not lead, for me, to further insights about effects of
schools on learning. I am sure, however, given the richness of the data
and the general sophistication of the analyses to which it has been subject,
that there are other important ones, some already reported in these studies,
and some which will be the result of further analyses of these data.
Altogether, although survey data are sharply limited in their ability to show
the effects of school characteristics upon learning, these studies, and the
data underlying them, will constitute in the years to come one of the most
important sources of insight into the effects of school resources in learning.*

* I have not here examined the effects of any specific school variables, but
have left these variables in their original "block". My reason for doing
so lies in my belief that survey methods are simply not capable of analyzing
such fine grain effects of school variables, and that they are useful only
for the more gross questions of the sort I have discussed in this section.
I should note that in these studies as in most comparable ones (my own EEOS
study included), valiant attempts were made to find effects of specific
school variables, but without consistent results except for some variables,
such as grade in school, which are merely indicators of the child's general
achievement level.
Appendix

Calculation of standardized regression
coefficient for a compound of variables,
controlling on another set of variables

Let the variables be labelled as follows:

1: a linear combination of Block 1 variables, created as the best-fitting
   combination for predicting achievement

2: a linear combination of Block 2 variables, the best-fitting
   combination for predicting achievement

4: achievement

Then $\beta_{42.1}$ is the standardized regression coefficient for Block 2
variables in explaining achievement, with Block 1 variables fixed.

$R_{4.21}$ is the multiple correlation of Blocks 1 and 2 together with
achievement.

$R_{41}$ is the multiple correlation of Block 1 variables with achievement.

$R_{42}$ is the multiple correlation of Block 2 variables with achievement.

$R_{21}$ is the correlation of the compound of Block 1 variables with the
compound of Block 2 variables.

$R_{4.21}$, $R_{41}$, and $R_{42}$ may be obtained directly as the multiple
correlations of regressions of achievement on Blocks 1 and 2, Block 1, and
Block 2, respectively. $R_{21}$ may be obtained from these three quantities
by use of the following equation:
Having $R_{41}$, $R_{42}$, and $R_{4.21}$, one can calculate $R_{21}$ and then
$\beta_{42.1}$, or alternatively use $R_{42}^2$ in place of $R_{41}^2$ in
equation (A-1).*

Another method by which $\beta_{42.1}$ may be calculated is to use the regression
weights from the regression equation including Blocks 1 and 2 to calculate
zero-order correlations of the new compound variable with all other variables.
The equation for doing so is given below, where:

$r_{ij}$ = zero-order correlation between variables i and j

$r_{cj}$ = zero-order correlation between variable j and the new compound
variable

$\sigma_i$ = standard deviation of variable i

$b_{i.s}$ = multiple regression coefficient of variable i with the dependent
variable in the presence of variables 1,...,n, the set of variables
to be labelled s

Variables 1,2,...,n are not to be compounded, and variables n+1,...,m
are to be compounded.

Then

$$r_{cj} = \frac{\sum_{i=n+1}^{m} b_{i.s}\, r_{ij}\, \sigma_i \sigma_j}
{\sigma_j \sqrt{\sum_{i=n+1}^{m} b_{i.s}^2 \sigma_i^2
 + 2 \sum_{i=n+1}^{m} \sum_{k=i+1}^{m} b_{i.s}\, b_{k.s}\, r_{ik}\, \sigma_i \sigma_k}} \qquad (A.5)$$



* It should be noted that this method for calculating $\beta_{42.1}$, in contrast to
the one given below, involves an approximation. Its virtue lies in the
fact that it may be calculated by hand after regression analyses have been
done, assuming that $R_{41}$ and $R_{42}$ have been obtained in the regression
analyses. The approximation lies in the fact that the method assumes that
"variable 1" and "variable 2" appearing in the equation that gives $R_{4.21}$
are exactly the same variables which appear in the equations that give
$R_{41}$ and $R_{42}$. If the regression weights are the same relative size for
variable 1 in both equations in which it enters, and for variable 2 in both
equations in which it enters, then the compounds will be the same, and the
assumption holds. If not, then a slight error is introduced, because the
compound that gives $R_{41}$ is designed to maximize it, thus producing a slightly
higher value of $R_{41}$ than the compound designed to maximize $R_{4.21}$ would.
The same calculation may be used for the correlation of the compound
variable with the dependent variable. Once these new zero-order correlations
have been calculated, then the new regression, and the desired standardized
regression coefficient for the new Block 2 compound, may be calculated.
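
The compound-variable correlation can be sketched in a few lines. The version below is an illustration only: the function name, argument layout, weights, and toy data are all assumptions, and the variance of the compound is written as the full double sum over i and k, which equals the squared terms plus twice the cross terms. It checks the formula against a correlation computed directly from compound scores.

```python
from math import sqrt


def mean(xs):
    return sum(xs) / len(xs)


def sd(xs):
    m = mean(xs)
    return sqrt(sum((x - m) ** 2 for x in xs) / len(xs))


def corr(xs, ys):
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return cov / (sd(xs) * sd(ys))


def compound_correlation(j, compound, b, r, sds):
    """Zero-order correlation between variable j and the compound
    sum_i b[i] * x_i, taken over the indices listed in `compound`."""
    cov_cj = sum(b[i] * r[i][j] * sds[i] * sds[j] for i in compound)
    var_c = sum(
        b[i] * b[k] * r[i][k] * sds[i] * sds[k]
        for i in compound
        for k in compound
    )
    return cov_cj / (sds[j] * sqrt(var_c))


# Toy data: variable 0 is left alone, variables 1 and 2 are compounded.
data = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [2.0, 1.0, 4.0, 3.0, 6.0, 5.0],
    [1.0, 3.0, 2.0, 5.0, 4.0, 7.0],
]
weights = {1: 0.5, 2: 1.5}  # hypothetical regression weights

sds = [sd(x) for x in data]
r = [[corr(a, c) for c in data] for a in data]

via_formula = compound_correlation(0, [1, 2], weights, r, sds)

# Direct check: build the compound scores and correlate them with variable 0.
compound_scores = [
    sum(weights[i] * data[i][t] for i in weights) for t in range(len(data[0]))
]
direct = corr(compound_scores, data[0])
print(round(via_formula, 6), round(direct, 6))
```

Because only correlations, standard deviations, and weights enter the formula, the calculation can be carried out from published summary statistics without access to the individual records.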

These two methods of calculating standardized regression coefficients
for new compounds may of course be incorporated into a computer program
so that calculations for the compound are automatically done by specifying
in advance the compounds for which standardized coefficients are desired.

When only $R_{4.21}$ and $R_{41}$ have been presented, as in the studies under
consideration, then some reasonable bounds for $\beta_{42.1}$ may be calculated as
follows:

The minimum of $R_{21}^2$ is 0, so that the maximum of the denominator of
A.1 is 1.0, and thus the minimum of $\beta_{42.1}$ is $\sqrt{R_{4.21}^2 - R_{41}^2}$.
For a reasonable, though not definite, upper bound, we note that $R_{41}$ is
ordinarily greater than $R_{21}$, since the variables 1 and 2 have been
selected to correlate with 4, and not with each other. Thus ordinarily,
$1 - R_{41}^2$ will be smaller than $1 - R_{21}^2$. Consequently, a reasonable
upper bound for $\beta_{42.1}$ would be $\sqrt{(R_{4.21}^2 - R_{41}^2)/(1 - R_{41}^2)}$.
Thus the following inequalities can be used to obtain an estimate for
$\beta_{42.1}$:

$$\sqrt{\frac{R_{4.21}^2 - R_{41}^2}{1 - R_{41}^2}} > \beta_{42.1} > \sqrt{R_{4.21}^2 - R_{41}^2}$$



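One might then estimate $\beta_{42.1}$ by averaging these two bounds. For the case
of England in Literature given earlier (p. 163), the estimates are:

$R_{41}^2 = .252$, $R_{4.21}^2 = .371$, $R_{4.321}^2 - R_{4.21}^2 = .039$

$$\sqrt{.119/.748} > \beta_{42.1} > \sqrt{.119}$$
$$.399 > \beta_{42.1} > .345$$
$$\beta_{42.1} \approx .372$$

$$\sqrt{.039/(1 - .371)} = \sqrt{.039/.629} > \beta_{43.21} > \sqrt{.039}$$
$$.249 > \beta_{43.21} > .197$$
$$\beta_{43.21} \approx .223$$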



These bounds are not extremely wide, and thus provide some useful
information.
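
The bounds and their midpoint are easy to compute from two published squared multiple correlations. The sketch below is illustrative; the function name `beta_bounds` is an assumption, and the inputs are the England Literature values for Block 2 given above.

```python
from math import sqrt


def beta_bounds(r2_with_block, r2_without_block):
    """Lower bound, upper bound, and midpoint estimate for the standardized
    coefficient of a block, given R-squared with and without that block.

    lower = sqrt(added variance)
    upper = sqrt(added variance / (1 - R-squared without the block))
    """
    added = r2_with_block - r2_without_block
    lower = sqrt(added)
    upper = sqrt(added / (1.0 - r2_without_block))
    return lower, upper, (lower + upper) / 2.0


# England, Literature: R-squared for Block 1 alone and for Blocks 1 and 2.
lower, upper, estimate = beta_bounds(0.371, 0.252)
print(f"{upper:.3f} > beta_42.1 > {lower:.3f}; estimate {estimate:.3f}")
```

The same function applies to Block 3 by passing the squared multiple correlations with and without the school variables.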